Consensus mapping for the immune atlas

Authors
Affiliations

SAiGENCI, The University of Adelaide

South Australian Health and Medical Research Institute (SAHMRI)

Walter and Eliza Hall Institute

Stefano Mangiola

SAiGENCI, The University of Adelaide

Walter and Eliza Hall Institute

Published

October 24, 2024

Abstract

Power analysis

Load data

Compute data-driven consensus

Cell types mapped

Mapping summaries

Here we break the links from the cytotoxic node to the cytotoxic cell types beneath it, such as nk, ilc, tgd, and t cd8. These links are useful for retaining higher-level annotations, but we want these cell types treated as an independent tree when determining the depth of annotation.
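The effect of breaking these links on annotation depth can be illustrated with a small sketch. It is written in Python for illustration only and is not the code behind this report. The node list, the edges, and the helper names `break_links` and `node_depths` are all hypothetical, and the sketch assumes each node has at most one parent.

```python
# Hypothetical (parent, child) links; the atlas's real hierarchy is not shown here.
NODES = ["immune", "cytotoxic", "nk", "ilc", "tgd", "t cd8", "cd8 tem"]
EDGES = [
    ("immune", "cytotoxic"),
    ("cytotoxic", "nk"),
    ("cytotoxic", "ilc"),
    ("cytotoxic", "tgd"),
    ("cytotoxic", "t cd8"),
    ("t cd8", "cd8 tem"),
]


def break_links(edges, parent, children):
    """Return the edges without the parent -> child links for the given children."""
    dropped = {(parent, child) for child in children}
    return [edge for edge in edges if edge not in dropped]


def node_depths(nodes, edges):
    """Depth of each node, where a node without a parent is a root at depth 0.

    Assumes every node has at most one parent.
    """
    parent_of = {child: parent for parent, child in edges}

    def depth(node):
        return 0 if node not in parent_of else 1 + depth(parent_of[node])

    return {node: depth(node) for node in nodes}


before = node_depths(NODES, EDGES)
pruned = break_links(EDGES, "cytotoxic", ["nk", "ilc", "tgd", "t cd8"])
after = node_depths(NODES, pruned)

for node in ["cytotoxic", "nk", "t cd8", "cd8 tem"]:
    print(f"{node}: depth {before[node]} -> {after[node]}")
```

In this toy hierarchy, nk and t cd8 become roots of their own trees once the links are removed, so the depth of cd8 tem is measured from t cd8 rather than from the cytotoxic branch.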

In a compositional analysis, we select a specific level of the immune tree hierarchy. Below, we collapse the nodes of the tree to a given level and assess how many cells remain annotated at that level. Finer resolutions leave fewer cells annotated. The plot below shows that even at the finest resolution (L3), ~10M cells remain annotated.
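A minimal sketch of collapsing labels to a chosen level and counting the cells that remain annotated is given here in Python, for illustration only. The hierarchy `PARENT`, the example cells, and the helpers `path_from_root` and `collapse` are hypothetical. Numbering the root as level 0 is also an assumption of this sketch and may not match the report's L1 to L3 labels.

```python
# Hypothetical hierarchy (child -> parent); not the atlas's actual tree.
PARENT = {
    "immune": None,
    "t": "immune",
    "b": "immune",
    "t cd4": "t",
    "t cd8": "t",
    "treg": "t cd4",
    "cd8 tem": "t cd8",
}


def path_from_root(label):
    """List of nodes from the root down to `label`, inclusive."""
    path = []
    node = label
    while node is not None:
        path.append(node)
        node = PARENT[node]
    return path[::-1]


def collapse(label, level):
    """Ancestor of `label` at depth `level`, or None if `label` is shallower."""
    path = path_from_root(label)
    return path[level] if level < len(path) else None


# Hypothetical per-cell annotations at mixed depths.
cells = ["treg", "t cd4", "t", "b", "cd8 tem", "immune"]

# Number of cells that still carry a label after collapsing to each level.
annotated = {
    level: sum(collapse(cell, level) is not None for cell in cells)
    for level in range(4)
}

for level, n_cells in annotated.items():
    print(f"level {level}: {n_cells} of {len(cells)} cells annotated")
```

A cell labelled only as t has no label at level 2 or deeper, which is why the count of annotated cells can only stay the same or shrink as the level gets finer.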

Resolving missing values at deeper annotation levels

Some cells will have missing annotations at specific levels. This is because neither the original annotation nor our annotation was able to further resolve their type in the cell type hierarchy. This can happen in three possible scenarios:

  1. The annotation is missing from one approach, and the other provides only a high-level annotation.
  2. The low-level annotation is the best possible using both approaches (one might have a higher-level annotation).
  3. The cell types called by the two approaches are siblings, so the parent annotation is the best we can achieve.
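The three scenarios above can be sketched as a simple rule for combining two labels on a tree. The Python below is an illustrative assumption, not the consensus procedure used in this report. The hierarchy `PARENT` and the helpers `ancestors` and `consensus` are hypothetical. The rule keeps the only available label when one is missing, keeps the more specific label when one is an ancestor of the other, and otherwise falls back to the lowest common ancestor.

```python
# Hypothetical hierarchy (child -> parent); not the atlas's actual tree.
PARENT = {
    "immune": None,
    "t": "immune",
    "b": "immune",
    "t cd4": "t",
    "t cd8": "t",
    "treg": "t cd4",
    "cd8 tem": "t cd8",
}


def ancestors(label):
    """`label` followed by its ancestors, ordered from the node up to the root."""
    path = []
    node = label
    while node is not None:
        path.append(node)
        node = PARENT[node]
    return path


def consensus(label_a, label_b):
    """Combine two labels (None means missing) into one label on the tree."""
    # Scenario 1: one approach gives no label, so keep whatever the other gives.
    if label_a is None:
        return label_b
    if label_b is None:
        return label_a
    path_a, path_b = ancestors(label_a), ancestors(label_b)
    # Scenario 2: the labels are nested, so keep the more specific one.
    if label_a in path_b:
        return label_b
    if label_b in path_a:
        return label_a
    # Scenario 3: the labels sit on different branches, so fall back to the
    # lowest common ancestor (the parent, when the labels are siblings).
    for node in path_a:
        if node in path_b:
            return node
    return None


print(consensus(None, "t"))
print(consensus("t", "t cd4"))
print(consensus("t cd4", "t cd8"))
```

Under this rule the combined label can be shallower than the level selected for analysis, which is what leaves a cell without an annotation at that level.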

# A tibble: 31 × 3
   cell_type_unified cell_type_unified_ensemble  NCells
   <chr>             <chr>                        <dbl>
 1 other             other                      2039871
 2 macrophage        other                        72296
 3 ilc               other                        21110
 4 t                 other                        19562
 5 t cd4             other                        15339
 6 nk                other                        10242
 7 cd8 tcm           other                         7332
 8 plasma            other                         6629
 9 mast              other                         6544
10 monocytic         other                         6314
11 treg              other                         5957
12 cd4 naive         other                         5573
13 dc                other                         5247
14 cdc               other                         3956
15 b                 other                         2753
16 erythrocyte       other                         2155
17 cytotoxic         other                         1939
18 t cd8             other                         1938
19 nkt               other                         1005
20 granulocyte       other                          870
21 pdc               other                          835
22 cd8 tem           other                          362
23 b naive           other                          199
24 cd4 tem           other                          107
25 cd14 mono         other                           93
26 b memory          other                           84
27 tgd               other                           27
28 cd4 tcm           other                           14
29 cd8 naive         other                           12
30 mait              other                            3
31 cd4 th1 em        other                            1

Dataset summaries

Summary per dataset

Summary per organ

Session Info

R version 4.4.0 (2024-04-24)
Platform: aarch64-apple-darwin20
Running under: macOS Sonoma 14.6.1

Matrix products: default
BLAS:   /Library/Frameworks/R.framework/Versions/4.4-arm64/Resources/lib/libRblas.0.dylib 
LAPACK: /Library/Frameworks/R.framework/Versions/4.4-arm64/Resources/lib/libRlapack.dylib;  LAPACK version 3.12.0

locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

time zone: Australia/Sydney
tzcode source: internal

attached base packages:
[1] grid      stats     graphics  grDevices utils     datasets  methods  
[8] base     

other attached packages:
 [1] ggalluvial_0.12.5        scico_1.5.0              ggraph_2.2.1            
 [4] tidygraph_1.3.1          ComplexHeatmap_2.20.0    tidyHeatmap_1.10.2      
 [7] DT_0.33                  CuratedAtlasQueryR_1.3.6 igraph_2.1.1            
[10] BiocParallel_1.38.0      patchwork_1.3.0          arrow_17.0.0.1          
[13] duckdb_1.1.1             DBI_1.2.3                lubridate_1.9.3         
[16] forcats_1.0.0            stringr_1.5.1            dplyr_1.1.4             
[19] purrr_1.0.2              readr_2.1.5              tidyr_1.3.1             
[22] tibble_3.2.1             ggplot2_3.5.1            tidyverse_2.0.0         

loaded via a namespace (and not attached):
  [1] RcppAnnoy_0.0.22            splines_4.4.0              
  [3] later_1.3.2                 polyclip_1.10-7            
  [5] fastDummies_1.7.4           lifecycle_1.0.4            
  [7] doParallel_1.0.17           globals_0.16.3             
  [9] lattice_0.22-6              MASS_7.3-61                
 [11] dendextend_1.18.1           backports_1.5.0            
 [13] magrittr_2.0.3              plotly_4.10.4              
 [15] rmarkdown_2.28              yaml_2.3.10                
 [17] httpuv_1.6.15               Seurat_5.1.0               
 [19] sctransform_0.4.1           spam_2.11-0                
 [21] sp_2.1-4                    spatstat.sparse_3.1-0      
 [23] reticulate_1.39.0           cowplot_1.1.3              
 [25] pbapply_1.7-2               RColorBrewer_1.1-3         
 [27] abind_1.4-8                 zlibbioc_1.50.0            
 [29] Rtsne_0.17                  GenomicRanges_1.56.2       
 [31] BiocGenerics_0.50.0         tweenr_2.0.3               
 [33] circlize_0.4.16             GenomeInfoDbData_1.2.12    
 [35] IRanges_2.38.1              S4Vectors_0.42.1           
 [37] ggrepel_0.9.6               irlba_2.3.5.1              
 [39] listenv_0.9.1               spatstat.utils_3.1-0       
 [41] goftest_1.2-3               RSpectra_0.16-2            
 [43] spatstat.random_3.3-2       fitdistrplus_1.2-1         
 [45] parallelly_1.38.0           leiden_0.4.3.1             
 [47] codetools_0.2-20            DelayedArray_0.30.1        
 [49] ggforce_0.4.2               shape_1.4.6.1              
 [51] tidyselect_1.2.1            UCSC.utils_1.0.0           
 [53] farver_2.1.2                viridis_0.6.5              
 [55] matrixStats_1.4.1           stats4_4.4.0               
 [57] spatstat.explore_3.3-3      jsonlite_1.8.9             
 [59] GetoptLong_1.0.5            progressr_0.14.0           
 [61] iterators_1.0.14            ggridges_0.5.6             
 [63] survival_3.7-0              foreach_1.5.2              
 [65] tools_4.4.0                 ica_1.0-3                  
 [67] Rcpp_1.0.13                 glue_1.8.0                 
 [69] gridExtra_2.3               SparseArray_1.4.8          
 [71] xfun_0.48                   MatrixGenerics_1.16.0      
 [73] ggthemes_5.1.0              GenomeInfoDb_1.40.1        
 [75] HDF5Array_1.32.1            withr_3.0.1                
 [77] fastmap_1.2.0               rhdf5filters_1.16.0        
 [79] fansi_1.0.6                 digest_0.6.37              
 [81] timechange_0.3.0            R6_2.5.1                   
 [83] mime_0.12                   colorspace_2.1-1           
 [85] Cairo_1.6-2                 scattermore_1.2            
 [87] tensor_1.5                  spatstat.data_3.1-2        
 [89] utf8_1.2.4                  generics_0.1.3             
 [91] data.table_1.16.2           graphlayouts_1.2.0         
 [93] httr_1.4.7                  htmlwidgets_1.6.4          
 [95] S4Arrays_1.4.1              uwot_0.2.2                 
 [97] pkgconfig_2.0.3             gtable_0.3.5               
 [99] blob_1.2.4                  lmtest_0.9-40              
[101] SingleCellExperiment_1.26.0 XVector_0.44.0             
[103] htmltools_0.5.8.1           vissE_1.12.0               
[105] dotCall64_1.2               clue_0.3-65                
[107] SeuratObject_5.0.2          scales_1.3.0               
[109] Biobase_2.64.0              png_0.1-8                  
[111] spatstat.univar_3.0-1       knitr_1.48                 
[113] rjson_0.2.23                tzdb_0.4.0                 
[115] reshape2_1.4.4              checkmate_2.3.2            
[117] nlme_3.1-166                cachem_1.1.0               
[119] GlobalOptions_0.1.2         zoo_1.8-12                 
[121] rhdf5_2.48.0                KernSmooth_2.23-24         
[123] parallel_4.4.0              miniUI_0.1.1.1             
[125] pillar_1.9.0                vctrs_0.6.5                
[127] RANN_2.6.2                  promises_1.3.0             
[129] dbplyr_2.5.0                xtable_1.8-4               
[131] cluster_2.1.6               evaluate_1.0.1             
[133] magick_2.8.5                cli_3.6.3                  
[135] compiler_4.4.0              rlang_1.1.4                
[137] crayon_1.5.3                future.apply_1.11.2        
[139] labeling_0.4.3              plyr_1.8.9                 
[141] stringi_1.8.4               viridisLite_0.4.2          
[143] deldir_2.0-4                assertthat_0.2.1           
[145] munsell_0.5.1               lazyeval_0.2.2             
[147] spatstat.geom_3.3-3         Matrix_1.7-1               
[149] RcppHNSW_0.6.0              hms_1.1.3                  
[151] bit64_4.5.2                 future_1.34.0              
[153] Rhdf5lib_1.26.0             shiny_1.9.1                
[155] SummarizedExperiment_1.34.0 ROCR_1.0-11                
[157] memoise_2.0.1               bit_4.5.0